Abstract. Moss and Rabani [13] study constrained node-weighted Steiner tree problems with two independent weight values associated with each node, namely, cost and prize (or penalty). They give an $O(\log n)$-approximation algorithm for the prize-collecting node-weighted Steiner tree problem (PCST), where the goal is to minimize the cost of a tree plus the penalty of vertices not covered by the tree. They use the algorithm for PCST to obtain a bicriteria $(2, O(\log n))$-approximation algorithm for the Budgeted node-weighted Steiner tree problem, where the goal is to maximize the prize of a tree with a given budget for its cost. Their solution may cost up to twice the budget, but collects a factor $\Omega(\frac{1}{\log n})$ of the optimal prize. We improve these results from at least two aspects.

Our first main result is a primal-dual $O(\log h)$-approximation algorithm for a more general problem, prize-collecting node-weighted Steiner forest (PCSF), where we have $h$ demands each requesting the connectivity of a pair of vertices. Our algorithm can be seen as a greedy algorithm which reduces the number of demands by choosing a structure with minimum cost-to-reduction ratio. This natural style of argument (also used by Klein and Ravi [11] and Guha et al. [9]) leads to a much simpler algorithm than that of Moss and Rabani [13] for PCST.

Our second main contribution is for the Budgeted node-weighted Steiner tree problem, which is also an improvement to Moss and Rabani [13] and Guha et al. [9]. In the unrooted case, we improve upon an $O(\log^2 n)$-approximation of [9], and present an $O(\log n)$-approximation algorithm without any budget violation. For the rooted case, where a specified vertex has to appear in the solution tree, we improve the bicriteria result of [13] to a bicriteria approximation ratio of $(1+\epsilon, O(\log n)/\epsilon^2)$ for any positive (possibly subconstant) $\epsilon$. That is, for any permissible budget violation $1+\epsilon$, we present an algorithm achieving a tradeoff in the guarantee for prize. Indeed, we show that this is almost tight for the natural linear-programming relaxation used by us as well as in [13].



1 Introduction 

In the rapidly evolving world of telecommunications and the internet, the design of fast and
efficient networks is of utmost importance. It is not surprising, therefore, that the field
of network design has continued to be an active area of research since its inception
several decades ago. These problems have applications not only in designing computer
and telecommunications networks, but are also essential for other areas such as VLSI
design and computational geometry [3]. Besides their appeal in these applications,
basic network design problems (such as Steiner Tree, TSP, and their variants) have been
the testbed for new ideas and have been instrumental in the development of new techniques
in the field of approximation algorithms.

In parallel to the study by Moss and Rabani [13], this work focuses on graph-theoretic problems in which two (independent) nonnegative weight functions are associated with the vertices, namely cost $c(v)$ and prize (or penalty) $\pi(v)$ for each vertex $v$ of the given graph $G(V, E)$. The goal is to find a connected subgraph $H$ of $G$ that optimizes a certain objective. We now summarize the four different problems, already introduced in the literature. In the Net Worth problem (NW), the goal is to maximize the prize of $H$ minus its cost. We prove in Appendix A that this natural problem does not admit any finite approximation algorithm. A similar, yet better-known objective is that of minimizing the cost of the subgraph plus the penalty of nodes outside of it (which is called Prize-Collecting Steiner Tree (PCST) in the literature). Two other problems arise if one restricts the range of either cost or prize in the desired solution. In particular, the Quota problem tries to find the minimum-cost tree among those with a total prize surpassing a given value, whereas the Budgeted problem deals with maximizing the prize with a given maximum budget for the cost. The rooted variants ask, in addition, that a certain root vertex be included in the solution. In the $k$-MST problem, the goal is to find a minimum-cost tree with at least $k$ vertices. In the $k$-Steiner Tree problem, given a set of terminals, the goal is to find a minimum-cost tree spanning at least $k$ terminals. We show the following reductions missing from the literature.

Theorem 1. Let $\alpha$, $0 < \alpha < 1$, be a constant. The following statements are equivalent
(both for edge-weighted and node-weighted variants):

(i) There is an $\alpha$-approximation algorithm for the rooted $k$-MST problem.
(ii) There is an $\alpha$-approximation algorithm for the unrooted $k$-MST problem.
(iii) There is an $\alpha$-approximation algorithm for the $k$-Steiner Tree problem.

Proof. Here we present the equivalence of (ii) and (iii) (see Appendix B for that of (i)
and (ii)). We note that one way is clear by definition, since $k$-MST is the special case of
$k$-Steiner Tree in which every vertex is a terminal. To prove that (ii) implies (iii),
we give a cost-preserving reduction from $k$-Steiner Tree to $k$-MST. Let
$\langle G = (V, E), T, k \rangle$ be an instance of $k$-Steiner Tree with the set of terminals $T \subseteq V$.
Let $n = |V|$. For every terminal $v_t \in T$, add $n$ vertices at distance zero of $v_t$. Let
$k' = kn + k$ and consider the solution to $k'$-MST on the new graph. Any subtree with
at most $k - 1$ terminals has at most $(k - 1)n + n - 1 = kn - 1$ vertices. Therefore
an optimal solution covers at least $k$ terminals. Hence the reduction preserves the cost
of the optimal solution. □
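The padding step of this reduction can be sketched in a few lines of Python. This is only an illustration under assumptions not in the text: the graph is an adjacency dictionary, the function name `k_steiner_to_k_mst` and the `("copy", t, j)` vertex labels are hypothetical, and "distance zero" is modeled in the node-weighted setting by giving each added vertex cost zero and attaching it directly to its terminal.

```python
def k_steiner_to_k_mst(adj, cost, terminals, k):
    """Pad a k-Steiner Tree instance into a k'-MST instance.

    adj: dict vertex -> set of neighbours (undirected graph)
    cost: dict vertex -> nonnegative node cost
    terminals: iterable of terminal vertices
    Returns (new_adj, new_cost, k_prime) with k_prime = k*n + k.
    """
    n = len(adj)
    new_adj = {v: set(nbrs) for v, nbrs in adj.items()}
    new_cost = dict(cost)
    for t in terminals:
        for j in range(n):
            copy = ("copy", t, j)   # hypothetical label for an added vertex
            new_adj[copy] = {t}     # hangs off the terminal only
            new_adj[t].add(copy)
            new_cost[copy] = 0      # zero cost, so the optimum is unchanged
    return new_adj, new_cost, k * n + k
```

A tree that wants to reach $k' = kn + k$ vertices is forced to collect the free copies, and these are only reachable through terminals.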

These results improve the approximation ratio for $k$-Steiner tree. Previously, a 4-
approximation algorithm was proved by [14] and a 5-approximation algorithm was due
to [4] who had also conjectured the presence of a $(2 + \epsilon)$-approximation algorithm. The
equivalence of $k$-Steiner tree and $k$-MST combined with the 2-approximation result of
Garg [7] leads to a 2-approximation algorithm for $k$-Steiner tree.

A more tractable version of the prize-collecting variant is the edge-weighted case
in which the costs (but not the prizes) are associated with edges rather than nodes. The
best known approximation ratio for the edge-weighted Steiner tree problem is 1.39 due
to Byrka et al. [5]. For the earlier work on the edge-weighted variant we refer the reader to
the references of [5]. In this paper, unless otherwise specified, all our graphs are node-
weighted and undirected.

1.1 Contributions and Techniques 

Approximation algorithm for PCSF. Klein and Ravi [11] were the first to give an
$O(\log h)$-approximation algorithm for the Steiner forest (SF) problem. Later, Guha et al. [9] improved
the analysis of [11] by showing that the approximation ratio of the algorithm of [11] is
w.r.t. the fractional optimal solution for the Steiner tree (ST) problem. The ST problem is a special
case of SF where all demands share an endpoint. Very recently and independently of
our work, Chekuri et al. [2] give an algorithm with an approximation ratio of $O(\log n)$
w.r.t. the fractional solution for SF and higher connectivity problems. This immedi-
ately provides a reduction from PCSF to the SF problem: one can fractionally solve the
LP for PCSF and pay the penalty of every demand for which the fractional solution pays
at least half its penalty. Hence, the remaining demands can be (fractionally) satisfied by
paying at most twice the optimal solution. Therefore, one can make a new instance of
SF with only the remaining demands and get a solution within an $O(\log n)$ factor of the
optimal solution using the SF algorithm.
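The threshold step of this reduction can be sketched as follows. It is a minimal illustration, assuming the fractional LP solution is already available as two dictionaries; the names `penalty_fraction`, `split_demands_by_lp`, and `scaled_connectivity` are hypothetical and not taken from any of the cited papers.

```python
def split_demands_by_lp(penalty_fraction, threshold=0.5):
    """Split demands by how much of their penalty the LP solution pays.

    penalty_fraction: dict demand index -> fraction in [0, 1] of the
    demand's penalty paid by the fractional solution (assumed input).
    Returns (pay, keep): demands whose penalty is paid outright, and
    demands passed on to the Steiner forest instance.
    """
    pay = [i for i, z in penalty_fraction.items() if z >= threshold]
    keep = [i for i, z in penalty_fraction.items() if z < threshold]
    return pay, keep


def scaled_connectivity(x, factor=2.0):
    """Scale fractional vertex values so the kept demands become
    fully (fractionally) satisfied; values are capped at 1."""
    return {v: min(1.0, factor * xv) for v, xv in x.items()}
```

Paying the full penalty for a demand in `pay` costs at most twice what the LP paid for it, and doubling the vertex values at most doubles the connection cost, which is where the factor two in the argument comes from.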

We start off by presenting a simple primal-dual $O(\log h)$-approximation algorithm
for the node-weighted prize-collecting Steiner forest (PCSF) problem where $h$ is the
number of connectivity demands; see Theorem 2. Compared to the PCST algorithm
given by Moss and Rabani [13] and Konemann et al. [12], our algorithm for PCSF
solves a more general problem and it has a simpler analysis. A reader familiar with
the moat-growing framework may recall that algorithms in this framework (e.g., that
of Moss and Rabani [13] or Konemann et al. [12]) consist of a growth phase and a
pruning phase. A moat is a set of dual variables corresponding to a laminar set of
vertices containing terminals, i.e., vertices with a positive penalty. The algorithm grows the
moats by increasing the dual variables and adding other vertices gradually to guarantee
feasibility. In the edge-weighted Steiner tree problem, when two moats collide on an
edge, the algorithm buys the path connecting the moats and merges the moats. Roughly
speaking, the algorithm stops growing a moat when either it reaches the root, or its total
growth reaches the total prize of terminals inside it. This process is not quite enough to
obtain a good approximation ratio. At the end of the algorithm we may have paid too
much for connecting unnecessary terminals. Thus as a final step one needs to prune the 
solution in a certain way to obtain the tight approximation ratio of 2 — - . 

In the node-weighted problem, one obstacle is that (polynomially) many moats may
collide on a vertex. Handling the proper growth of the moats and the process to merge
them proves to be very sophisticated. This may have been the reason that for more than
a decade no one noticed the flaw in the algorithm of Moss and Rabani [13].⁵ Indeed
the recently proposed algorithm by Konemann et al. [12] is even more sophisticated.
In our algorithm, not only do we completely discard the pruning phase, but we also
never merge the moats (thus intuitively, a moat forms a disk centered at a terminal). In
fact, our algorithm can be thought of as a simple greedy algorithm. Our algorithm runs
in iterations, and in each iteration several disks are grown simultaneously on different
endpoints of the demands. The growth stops at the largest possible radius where there
are no "overlaps" and no disk has run out of "penalty." If the disks corresponding to
several endpoints hit each other, a set of paths connecting them is added to the solution
and all but one representative endpoint are removed for the next iteration. However, if a
disk is running out of penalty, the terminal at its center is removed for the next iteration.
The cost incurred at each iteration is a fraction of OPT, proportional to the fraction of
endpoints removed, hence the logarithmic term in the guarantee.

Although our primal-dual approach is different from the approach known for
SF [11,9], we indeed use the same style of argument to analyze our algorithm. The crux
of these algorithms is to reduce the number of components of the solution by using a
structure with minimum cost-to-reduction ratio. Besides the simplicity of this trend, it
is important that by avoiding the pruning phase, these algorithms may lead to progress
in related settings such as streaming and online settings. The moat-growing approach
of Konemann et al. [12], however, allows a stronger Lagrangian-preserving guarantee⁶
for PCST. This property is shown to be quite important for solving various problems
such as $k$-MST and $k$-Steiner tree (see e.g. [4,10]).

Approximation algorithms for the Budgeted problem. Using their algorithm for PCST,
Moss and Rabani developed a bicriteria⁷ approximation algorithm for the Budgeted
problem, one that achieves an approximation factor $O(\log n)$ on prize while violating
the budget constraint by no more than factor two [13]. We present in Theorem 3 a
modified pruning procedure that improves the bicriteria bound to $(1 + \epsilon, O(\log n)/\epsilon^2)$;
in other words, if the algorithm is allowed to violate the budget constraint by only
a factor $1 + \epsilon$ (for any positive $\epsilon$), the approximation guarantee on the prize will be
$O(\log n)/\epsilon^2$. In fact, we also show using the natural linear-programming relaxation
(used in [13] as well), that it is not possible to improve these bounds significantly;
see Appendix C. In particular, there are instances for which the fractional solution is
$\mathrm{OPT}/\epsilon$, however, no solution of cost at most $1 + \epsilon$ times the budget has prize more than
$O(\mathrm{OPT})$. Our integrality-gap construction fails if the instance is not rooted. Indeed, in
that case, we show how to obtain an $O(\log n)$-approximation algorithm with no budget
violations; see Theorem 4. This improves the $O(\log^2 n)$-approximation algorithm of
Guha et al. [9].⁸ To get over the integrality gap of the LP formulation, we prove several
structural properties for near-optimal solutions. By restricting the solution to one with
these properties, we use a bicriteria approximation algorithm as a black box to find a
near-optimal solution. Finally we use a generalization of the trimming method of [9] to
avoid violating the budget.

⁵ In private correspondence the authors of the original work have admitted that their algorithm
is flawed and that it cannot be fixed easily.
⁶ Let $T$ denote the set of vertices purchased by the algorithm of [12]. It is guaranteed that
$c(T) + \log(n)\,\pi(V \setminus T) \le \log(n)\,\mathrm{OPT}$.
⁷ An $(\alpha, \beta)$-bicriteria approximation algorithm for the Budgeted problem finds a tree with total
prize at least a $\frac{1}{\beta}$ fraction of that of the optimal solution and total cost at most an $\alpha$ factor of the budget.
⁸ The $O(\log^2 n)$-approximation algorithm can be derived from the results in [9] with some
effort, not as explicitly as cited by Moss and Rabani [13].

1.2 Organization 

Next, in Section 2, we briefly discuss the method of Moss and Rabani for deriving an
algorithm for the Budgeted problem from that for PCST. We then explain and analyze
our algorithm for PCSF in Section 2.2. Section 3 discusses our trimming procedure and
how it leads to improved results for Budgeted problems. Finally, the appendices contain
minor results for hardness of NW and reductions between special cases of the Quota
problem, as well as omitted proofs.

2 The Prize-Collecting Steiner Forest Problem 

The starting point of the algorithm of Moss and Rabani [13] is a standard LP relaxation 
for the rooted version. For the Quota and Budgeted problems they show that any (frac- 
tional) feasible solution can be approximated by a convex combination of sets of nodes 
connected (integrally) to the root. Given the support of such a convex combination, it 
follows from an averaging argument that a proper set can be found. Thus the problem 
comes down to finding the support of the convex combination. They show that given 
a black-box algorithm which solves the PCST problem with the approximation factor
$O(\log n)$, one can obtain the support in polynomial time.

The main result of this section is a very simple, and maybe more elegant algorithm 
for the classical problem of PCSF (and thus PCST). As mentioned before, using moats 
and having a pruning phase lead to the main difficulty in the analysis of previous algo- 
rithms. These seem to be a necessary evil for achieving a tight constant approximation 
factor for the edge-weighted variant. Surprisingly, we show neither is needed in the 
node-weighted variant. Instead of moats, we use dual disks which are centered on a 
single terminal and we do not need a pruning phase. 

2.1 Preliminaries 

Consider a graph $G = (V, E)$ with a node-weight function $c : V \to \mathbb{R}_{\ge 0}$. For a subset
$S \subseteq V$, let $c(S) := \sum_{v \in S} c(v)$. In the Steiner Forest problem, given a set of demands
$\mathcal{L} = ((s_1, t_1), \ldots, (s_h, t_h))$, the goal is to find a set of vertices $X$ such that for every
demand $i \in [h]$, $s_i$ and $t_i$ are connected in $G[X]$. The vertices $s_i$ and $t_i$ are denoted as
the endpoints of the demand $i$. In PCSF a penalty (prize) $\pi_i \in \mathbb{R}_{\ge 0}$ is associated with
every demand $i \in [h]$. If the endpoints of a demand are not connected in the solution,
we need to pay the penalty of the demand. The objective cost of a solution $X \subseteq V$ is

$$\mathrm{PCSF}(X) = c(X) + \sum_{i \in [h]\,:\, i \text{ is not satisfied}} \pi_i.$$
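As a sanity check on the definition, the objective can be evaluated directly for a given vertex set. The sketch below assumes an adjacency-dictionary graph; the function name `pcsf_objective` and the input layout are illustrative choices, not notation from the paper.

```python
from collections import deque


def pcsf_objective(adj, cost, demands, penalty, X):
    """Evaluate c(X) plus the penalties of demands not connected in G[X].

    adj: dict vertex -> set of neighbours
    cost: dict vertex -> nonnegative node cost
    demands: list of (s_i, t_i) pairs; penalty: list of pi_i, same order
    X: iterable of purchased vertices
    """
    X = set(X)
    comp = {}  # vertex in X -> representative of its component in G[X]
    for s in X:
        if s in comp:
            continue
        comp[s] = s
        queue = deque([s])
        while queue:  # breadth-first search restricted to X
            u = queue.popleft()
            for w in adj[u]:
                if w in X and w not in comp:
                    comp[w] = s
                    queue.append(w)
    total = sum(cost[v] for v in X)
    for (s, t), p in zip(demands, penalty):
        satisfied = s in comp and t in comp and comp[s] == comp[t]
        if not satisfied:
            total += p
    return total
```

For example, on two disjoint paths `s1 - a - t1` and `s2 - b - t2` with `c(a) = 3`, `c(b) = 10` and penalties 5 and 4, buying `a` but not `b` gives objective 3 + 4 = 7.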



A terminal is a vertex which is an endpoint of a demand. Let $T$ denote the set
of terminals. We may assume that the cost of a terminal is zero. We also assume the
endpoints of all demands are different⁹ (thus $|T| = 2h$). For a pair of vertices $u$ and $v$
and a cost function $c$, let $d^c(u, v)$ denote the length of the shortest path with respect to
$c$ connecting $u$ and $v$, including the cost of endpoints.
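The node-weighted distance can be computed with a standard Dijkstra search in which entering a vertex costs that vertex's weight. The following is a minimal sketch under the assumption of an adjacency-dictionary graph; the function name `node_weighted_distance` is hypothetical.

```python
import heapq
from itertools import count


def node_weighted_distance(adj, cost, source, target):
    """Length of a cheapest source-target path, where a path's length is
    the sum of the costs of all its vertices, endpoints included."""
    tie = count()  # tie-breaker so vertices never need to be comparable
    dist = {source: cost[source]}
    heap = [(cost[source], next(tie), source)]
    while heap:
        d, _, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist[u]:
            continue  # stale heap entry
        for w in adj[u]:
            nd = d + cost[w]  # pay for the vertex being entered
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, next(tie), w))
    return float("inf")
```

With this convention the distance from a vertex to itself is its own cost, so two terminals are at distance zero exactly when they are joined by a path of zero-cost vertices.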

For a set of vertices $S$ let $\delta(S)$ denote the set of vertices that are not in $S$ but have
neighbors in $S$. A set $S$ separates a demand $i$ if exactly one of $s_i$ and $t_i$ is in $S$. Let
$\mathcal{S}_i$ denote the collection of sets separating the demand $i$ and let $\mathcal{S} = \bigcup_i \mathcal{S}_i$. For a set
$S$, define the penalty of $S$ as half of the total penalty of demands separated by $S$, i.e.,
$\pi_{\mathcal{L}}(S) = \frac{1}{2} \sum_{i : S \in \mathcal{S}_i} \pi_i$. We may drop the index $\mathcal{L}$ when there is no ambiguity. The
PCSF problem can be formulated as the following standard integer program (IP):

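$$\begin{aligned}
\text{Minimize} \quad & \sum_{v \in V \setminus T} c(v)\,x(v) + \sum_{S \in \mathcal{S}} \pi(S)\,z(S) \\
\forall i \in [h],\ S \in \mathcal{S}_i \quad & \sum_{v \in \delta(S)} x(v) + \sum_{R \,:\, S \subseteq R \in \mathcal{S}_i} z(R) \ge 1 \\
& x(v),\ z(S) \in \{0, 1\}
\end{aligned}$$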

Given a solution $X \subseteq V$ to the PCSF problem one can easily make a feasible solution
$x$ to the IP with the same objective value as $\mathrm{PCSF}(X)$: since the cost of a terminal is
zero, we assume $T \subseteq X$. For every vertex $v \in X$ set $x(v) = 1$ and for every connected
component $CC$ of $G[X]$ set $z(V \setminus CC) = 1$. It is also easy to verify that, since the cost of
a terminal is zero, any (integral) feasible solution $x$ corresponds to a solution $X \subseteq V$
for the PCSF problem with (at most) the same cost. One may relax the IP by allowing
assignment of fractional values to the variables. Let OPT denote the objective value
of the optimal solution for the relaxed linear program (LP). The following is the dual
program $\mathcal{D}$ corresponding to the relaxed LP.

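$$\begin{aligned}
\text{Maximize} \quad & \sum_{S \in \mathcal{S}} y(S) && (\mathcal{D}) \\
\forall v \in V \quad & \sum_{S \in \mathcal{S} \,:\, v \in \delta(S)} y(S) \le c(v) \\
\forall S \in \mathcal{S} \quad & \sum_{S' \subseteq S} \ \sum_{i \,:\, S, S' \in \mathcal{S}_i} y_i(S') \le \pi(S) \\
& y_i(S) \ge 0, \quad y(S) = \sum_{i \,:\, S \in \mathcal{S}_i} y_i(S)
\end{aligned}$$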



In the case of Steiner tree, the dual variables are defined w.r.t. a set $S$. However, in
Steiner forest, the dual variables are in the form $y_i(S)$, i.e., they are defined based on
a demand as well. This has been one source of the complexity of previous primal-dual 
algorithms for Steiner forest problems. Interestingly, in our approach, we only need to 
work with a simplified dual constructed as follows. 



⁹ Both assumptions are without loss of generality. For every demand $(s_i, t_i)$, attach a new vertex
$s'$ of cost zero to $s_i$ and similarly attach a new vertex $t'$ of cost zero to $t_i$. Now interpret $i$ as
the demand between $s'$ and $t'$. The optimal cost does not change.



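Cores and Simplified Duals. Let $c$ and $\mathcal{L}$ denote a node-weight function and a set of
demands, respectively. Let $Z_c$ denote the set of vertices with zero cost. We note that the
terminals are in $Z_c$. A set $C \subseteq V$ is a core if $C$ is a connected component of $G[Z_c]$ and
contains a terminal (i.e., an endpoint of a demand in $\mathcal{L}$). Let $\mathcal{S}(c, \mathcal{L})$ be the collection
of sets separating one core from the other cores, i.e., a set $S$ is in $\mathcal{S}(c, \mathcal{L})$ if $S$ contains a
core but has no intersection with other cores. For a set $S \in \mathcal{S}(c, \mathcal{L})$, let $\mathrm{core}(S)$ denote
the core inside $S$. Note that $\pi_{\mathcal{L}}(S) = \pi_{\mathcal{L}}(\mathrm{core}(S))$. A simplified dual w.r.t. $c$ and $\mathcal{L}$ is
the following program $\mathcal{D}(c, \mathcal{L})$.

$$\begin{aligned}
\text{Maximize} \quad & \sum_{S \in \mathcal{S}(c, \mathcal{L})} y(S) && (\mathcal{D}(c, \mathcal{L})) \\
\forall v \in V \quad & \sum_{S \in \mathcal{S}(c, \mathcal{L}) \,:\, v \in \delta(S)} y(S) \le c(v) && (\mathrm{C1}) \\
\forall S \in \mathcal{S}(c, \mathcal{L}) \quad & \sum_{S' \,:\, \mathrm{core}(S) \subseteq S' \subseteq S} y(S') \le \pi_{\mathcal{L}}(S) && (\mathrm{C2}) \\
& y(S) \ge 0
\end{aligned}$$

Observe that $\mathcal{S}(c, \mathcal{L}) \subseteq \mathcal{S}$. Indeed $\mathcal{D}(c, \mathcal{L})$ is the same as $\mathcal{D}$ with only (much) fewer
variables. Thus the program $\mathcal{D}(c, \mathcal{L})$ is only more restricted than $\mathcal{D}$. In the rest of the
paper, unless specified otherwise, by a dual we mean a simplified dual. When clear from
the context, we may omit the indices $c$ and $\mathcal{L}$.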
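Cores are straightforward to compute: they are the connected components of the zero-cost subgraph that contain at least one terminal. The sketch below assumes an adjacency-dictionary graph and a list of demand pairs; the name `find_cores` is hypothetical.

```python
def find_cores(adj, cost, demands):
    """Return the cores: components of G[Z_c] that contain a terminal.

    adj: dict vertex -> set of neighbours
    cost: dict vertex -> nonnegative node cost
    demands: list of (s_i, t_i) pairs of the (active) demands
    """
    terminals = {v for pair in demands for v in pair}
    seen, cores = set(), []
    for s in adj:
        if cost[s] != 0 or s in seen:
            continue
        comp, stack = {s}, [s]
        seen.add(s)
        while stack:  # depth-first search over zero-cost vertices only
            u = stack.pop()
            for w in adj[u]:
                if cost[w] == 0 and w not in seen:
                    seen.add(w)
                    comp.add(w)
                    stack.append(w)
        if comp & terminals:  # zero-cost components without terminals are not cores
            cores.append(comp)
    return cores
```

A zero-cost component that contains no endpoint of a demand is ignored, which matches the requirement that a core contain a terminal.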

Disks. Consider a dual vector $y$ initialized to zero. A disk of radius $R$ centered at a
terminal $t$ is the dual vector obtained from the following process: Initialize the set $S$ to
the core containing $t$. Increase $y(S)$ until for a vertex $u$ the dual constraint C1 becomes
tight. Add $u$ to $S$ and repeat with the new $S$. Stop the process when the total growth (i.e.,
sum of the dual variables) reaches $R$. A disk is valid if $y$ is feasible. In what follows,
by a disk we mean a valid disk unless specified otherwise.
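The growth process can be simulated directly. The following is a minimal sketch for a single disk, assuming an adjacency-dictionary graph and exact arithmetic (integers or `fractions.Fraction`); the name `grow_disk` is hypothetical. It tracks only constraint C1 and does not check C2, so it does not decide whether the resulting disk is valid.

```python
def grow_disk(adj, cost, core, radius):
    """Grow a single disk of the given radius from a core.

    Returns (y, load): y is a list of (set, dual value) pairs in the order
    the sets were grown, and load[v] is the sum of y(S) over sets S with
    v in delta(S), i.e. the left-hand side of constraint C1 at v.
    """
    S = set(core)
    load = {v: 0 for v in adj}
    y = []
    remaining = radius
    while remaining > 0:
        boundary = {u for v in S for u in adj[v]} - S  # delta(S)
        if not boundary:
            # No vertex can become tight; the rest of the growth stays on S.
            y.append((frozenset(S), remaining))
            break
        slack = min(cost[u] - load[u] for u in boundary)
        step = min(slack, remaining)
        if step > 0:
            y.append((frozenset(S), step))
            for u in boundary:
                load[u] += step
            remaining -= step
        # Vertices whose constraint C1 became tight join the set.
        S |= {u for u in boundary if load[u] >= cost[u]}
    return y, load
```

On a path `t - a - b - c` with costs 0, 3, 4, 5 and radius 5, the set `{t}` receives dual value 3 (until `a` becomes tight) and `{t, a}` receives the remaining 2, so the load on `b` is 2.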

A vertex $v$ is inside the disk if $d^c(t, v)$ is strictly less than $R$. The continent of a disk
is the set of vertices inside the disk. Further, we say a vertex $v$ is on the boundary of a
disk if it is not inside the disk but has a neighbor $u$ such that $d^c(t, u) \le R$. Note that $u$ is
not necessarily inside the disk. See Figure 1 for a graphical representation of a disk. The
following facts about a disk of radius $R$ centered at a terminal $t$ can be derived from the
definition:

Fact 1. The (dual) objective value of the disk is exactly $R$.

Fact 2. For every vertex inside the disk, the dual constraint C1 is tight.

Fact 3. If a set $S$ does not include the center, then $y(S) = 0$. Further, if $S$ is not a
subset of the continent, then $y(S) = 0$.

Fig. 1. A graphical representation of a disk of radius 10. The vertex at the center of the disk is an
endpoint of a demand. The numbers show the cost of vertices. The inner-most circle contains the
core, while the outer-most circle contains the continent and the boundary.

Let $y_1, \ldots, y_k$ denote a set of disks. The union of the disks is simply a dual vector
$y$ such that $y(S) = \sum_i y_i(S)$ for every set $S \in \mathcal{S}$. A set of disks is non-overlapping
if their union is a feasible dual solution (i.e., both sets of constraints C1 and C2 hold). If
a vertex $v$ is inside a disk, the corresponding dual constraint is tight. Thus for any set
$S$ such that $v \in \delta(S)$, the dual variable $y(S)$ cannot be increased. On the other hand,
since the distance between $v$ and the center is strictly less than the radius, there exists a
set containing $v$ with positive dual value. This observation leads to the following.

Proposition 1. Let $y$ be the union of a set of non-overlapping disks $y_1, \ldots, y_k$. A vertex
inside a disk cannot be on the boundary of another disk.

Proposition 1 implies that in the union of a set of non-overlapping disks, the con- 
tinents are pairwise far from each other. This intuition leads to the following (proof in 
Appendix D). 

Lemma 1. Suppose $T'$ is a subset of terminals such that the distance between every
pair of them is non-zero. Let $R$ denote the maximum radius such that the $|T'|$ disks of
radius $R$ centered at terminals in $T'$ are non-overlapping. Consider the union of such
disks. Either (i) the constraint C2 is tight for a continent; or (ii) the constraint C1 is
tight for a vertex on the boundary of multiple disks.

The final tool we need for the analysis of the algorithm states a precise relation 
between the dual variables and the distance of a vertex on the boundary. The proof is 
based on the analysis of the growth of a disk which is presented in Appendix D. 

Lemma 2. Let $v$ be a vertex on the boundary of a disk $y$ of radius $R$ centered at a
terminal $t$. We have $\sum_{S \,:\, v \in \delta(S)} y(S) = R - (d^c(t, v) - c(v))$.



2.2 An algorithm for the PCSF problem 

The algorithm finds the solution $X$ iteratively. Let $X_i$ denote the set of vertices bought
after iteration $i$ where $X_0$ is the set of terminals. For every $i$, the modified cost function
$c_i$ is a copy of $c$ induced by setting the cost of vertices in $X_{i-1}$ to zero, i.e.,
$c_i = c[X_{i-1} \to 0]$. At iteration $i$ there is a set of active demands $\mathcal{L}_i \subseteq \mathcal{L}$ and the dual
program $\mathcal{D}_i = \mathcal{D}(c_i, \mathcal{L}_i)$. The program $\mathcal{D}_i$ is the simplified dual program w.r.t. the
modified cost function and the active demands. Note that $\mathcal{D}_i$ is stricter than $\mathcal{D}$, thus the
objective value of a feasible solution to $\mathcal{D}_i$ is a lower bound for OPT. The algorithm
guarantees that for every $i < j$, $X_i \subseteq X_j$ and $\mathcal{L}_i$ is a superset of $\mathcal{L}_j$.

The algorithm is as follows (see Algorithm 1). We initialize $X_0 = T$, $c_1 = c$, and
$\mathcal{L}_1 = \mathcal{L}$. At iteration $i$, consider the cores formed w.r.t. $c_i$ and $\mathcal{L}_i$. Let $T_i$ denote a set
which has exactly one terminal in each core (so the number of cores is $|T_i|$). The algorithm
finds the maximum radius $R_i$ such that the $|T_i|$ disks of radius $R_i$ centered at
each terminal in $T_i$ are non-overlapping w.r.t. $\mathcal{D}_i$. By Lemma 1 either the constraint C2
is tight for a continent $S$; or the constraint C1 is tight for a vertex $v$ on the boundary
of multiple disks. In the former, deactivate every demand with exactly one endpoint in
$\mathrm{core}(S)$; pay the penalty of such demands and continue to the next iteration with the
remaining active demands. In the latter, let $L_v$ denote the centers of the disks whose
boundaries contain $v$. For every terminal $\tau \in L_v$ buy the shortest path w.r.t. $c_i$ connecting
$\tau$ to $v$ (and so to the core containing $\tau$). Deactivate a demand if its endpoints are now
connected in the solution and continue to the next iteration. The algorithm stops when
there is no active demand remaining, in which case it returns the final set of vertices
bought by the algorithm.



Algorithm 1.
 1: Initialize $X_0 = T$, $\mathcal{L}_1 = \mathcal{L}$, $c_1 = c$, and $i = 1$.
 2: while $|\mathcal{L}_i| > 0$ do
 3:     Set $c_i = c[X_{i-1} \to 0]$ and construct the dual program $\mathcal{D}_i$ with respect to $c_i$ and $\mathcal{L}_i$.
 4:     Construct $T_i$ by choosing an arbitrary terminal from each core.
 5:     Let $R_i$ be the maximum radius such that putting a disk of radius $R_i$ centered at every terminal in $T_i$ is feasible w.r.t. $\mathcal{D}_i$.
 6:     if the constraint C2 is tight for a continent $S$ then
 7:         Set $X_i = X_{i-1}$.
 8:         Set $\mathcal{L}_{i+1} = \mathcal{L}_i \setminus \{j \in [h] \mid \text{either } s_j \in \mathrm{core}(S) \text{ or } t_j \in \mathrm{core}(S)\}$.
 9:     else
10:         Find a vertex $v$ on the boundary of multiple disks for which constraint C1 is tight.
11:         Let $L_v$ denote the centers of the disks whose boundaries contain $v$.
12:         Initialize $X_i = X_{i-1}$.
13:         for all $\tau \in L_v$ do
14:             Add the shortest path (w.r.t. $c_i$) between $\tau$ and $v$ to $X_i$.
15:         Set $\mathcal{L}_{i+1} = \mathcal{L}_i \setminus \{j \in [h] \mid d^{c_{i+1}}(s_j, t_j) = 0\}$.
16:     $i = i + 1$.
17: Output $X_{i-1}$.



We bound the objective cost of the algorithm in each iteration separately. The fol- 
lowing theorem shows that the fraction of OPT we incur at each iteration is proportional 
to the reduction in the number of cores after the iteration. 

Theorem 2. The approximation ratio of Algorithm 1 is at most $2H_{2h}$ where $H_{2h}$ is the
$(2h)^{\text{th}}$ harmonic number.

Proof. Observe that at each iteration, a core is a connected component of the solution
which contains an endpoint of at least one active demand. We distinguish between two
types of iterations: In Type I, Line 8 of Algorithm 1 is executed while in Type II, Line 15
is executed.

Observe that a demand is deactivated either at Line 8 or at Line 15. In the latter,
the endpoints of a demand are indeed connected in the solution. Thus we only need to
pay the penalty of a demand if it is deactivated in an iteration of Type I. Recall that at
Line 8, the penalty of $\mathrm{core}(S)$ is half the total penalty of demands cut by $S$. Thus the
total penalty we incur at that line is exactly $2\pi_{\mathcal{L}_i}(S)$.

We now break the total objective cost of the algorithm into a payment $P_i$ for each
iteration $i$ as follows:

$$P_i = \begin{cases} 2\pi_{\mathcal{L}_i}(S) & \text{for Type I iterations executing Line 8 with the continent } S, \\ c(X_i) - c(X_{i-1}) & \text{for Type II iterations.} \end{cases}$$

Recall that $|T_i|$ is the number of cores at iteration $i$. Observe that by Fact 1, at iteration
$i$ the total dual vector has value $R_i |T_i|$. By weak duality, $R_i \le \frac{\mathrm{OPT}}{|T_i|}$. For every $i \ge 1$,
let $h_i = |T_i| - |T_{i+1}|$ denote the reduction in the number of cores after the iteration $i$.

Claim. $P_i \le 2 h_i R_i$ for every iteration $i$.

Proof. Fix an iteration $i$. Let $y$ denote the union of disks of radius $R_i$ centered at $T_i$.
We distinguish between the two types of the iteration:

- Type I. At Line 8, by deactivating all the demands crossing a core, we essentially
remove that core. Thus in such an iteration $h_i = 1$. The objective cost of
the iteration is $2\pi_{\mathcal{L}_i}(S)$. On the other hand, the constraint C2 is tight for $S$, i.e.,
$\sum_{S' \subseteq S} y(S') = \pi_{\mathcal{L}_i}(S)$. By Facts 1 and 3, the radius $R_i$ equals $\sum_{S' \subseteq S} y(S')$.
Therefore the objective cost is at most $2 h_i R_i$.

- Type II. At Line 15, we connect $|L_v|$ cores to each other, thus reducing the number
of cores in the next iteration by at least $h_i \ge |L_v| - 1$.¹⁰ Recall that by Lemma 1,
$|L_v| \ge 2$ and hence $h_i \ge 1$. The total cost of connecting terminals in $L_v$ to $v$ is
bounded by $c_i(v)$ plus, for every $\tau \in L_v$, the cost of the path connecting $\tau$ to $v$
excluding $c_i(v)$. Thus $P_i \le c_i(v) + \sum_{\tau \in L_v} (d^{c_i}(\tau, v) - c_i(v))$. Now we write the
equation in Lemma 2 for every disk centered at a terminal in $L_v$:

$$|L_v| R_i = \sum_{\tau \in L_v} \Big[ d^{c_i}(\tau, v) - c_i(v) + \sum_{S \,:\, v \in \delta(S),\, \tau \in S} y(S) \Big] = \sum_{\tau \in L_v} \big[ d^{c_i}(\tau, v) - c_i(v) \big] + c_i(v) \ge P_i,$$

where the last equality follows since the constraint C1 is tight for $v$. Since the disks
are non-overlapping, by Fact 3, $y(S)$ is positive only if it contains a single terminal
of $L_v$. This completes the proof since $P_i \le |L_v| R_i \le (h_i + 1) R_i \le 2 h_i R_i$. □

¹⁰ In the special case that every endpoint in the cores becomes connected to the other endpoint of
its demand, $h_i = |L_v|$; otherwise $h_i = |L_v| - 1$.

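Let $X$ be the final solution of the algorithm. Note that $|T_{i+1}| = |T_i| - h_i$ and
$|T_i| \le |T|$. A simple calculation shows

$$\mathrm{PCSF}(X) \le \sum_i P_i \le \sum_i 2 h_i R_i \le 2\,\mathrm{OPT} \sum_i \frac{h_i}{|T_i|} \le 2\,\mathrm{OPT} \cdot H_{|T|}. \qquad \Box$$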
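The last inequality of this calculation rests on the bound $\sum_i h_i / |T_i| \le H_{|T|}$, which holds because $h_i / |T_i| \le \frac{1}{|T_i|} + \frac{1}{|T_i| - 1} + \cdots + \frac{1}{|T_i| - h_i + 1}$. A small numeric check, with hypothetical helper names `harmonic` and `reduction_sum`, can make this concrete:

```python
from fractions import Fraction


def harmonic(n):
    """The n-th harmonic number H_n as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, n + 1))


def reduction_sum(num_cores, reductions):
    """Sum of h_i / |T_i| when the number of cores starts at num_cores
    and drops by h_i after iteration i."""
    total, current = Fraction(0), num_cores
    for h in reductions:
        total += Fraction(h, current)
        current -= h
    return total
```

For instance, starting from 6 cores with reductions 2, 1, 3 gives 2/6 + 1/4 + 3/3 = 19/12, which is below $H_6 = 49/20$; removing one core per iteration attains the harmonic number exactly.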



3 The Budgeted Steiner Tree Problem 

In this section we consider the Budgeted problem in the node-weighted Steiner tree
setting. Recall that for a vertex $v \in V$, we denote the prize and the cost of the vertex by
$\pi(v)$ and $c(v)$, respectively. First we generalize the trimming process of Guha et al. [9]
which reduces the budget violation of a solution while preserving the prize-to-cost ratio.
We use this process to obtain a bicriteria approximation algorithm for the rooted version
in Section 3.1. Next, in Section 3.2 we consider the unrooted version. By providing a
structural property of near-optimal solutions, we propose an algorithm which achieves
a logarithmic approximation factor without violating the budget constraint, improving
on the previous result of Guha et al. [9] which obtains an $O(\log^2 n)$-approximation
algorithm without violation.

In what follows, for a rooted tree $T$ we assume a subtree rooted at a vertex $v$ consists
of all vertices whose path to the root of $T$ passes through $v$. The set of strict subtrees of
$T$ consists of all subtrees other than $T$ itself. Further, the immediate subtrees of
$T$ are the subtrees rooted at the children of the root of $T$.



3.1 The Rooted Budgeted Problem 

For a budget value $B$ and a vertex $r$, a graph is $B$-proper for the vertex $r$ if the cost
of reaching any vertex from $r$ is at most $B$. The following lemma shows a bicriteria
trimming method.

Lemma 3. Let $T$ be a subtree rooted at $r$ with the prize-to-cost ratio $\gamma$. Suppose the
underlying graph is $B$-proper for $r$ and for $\epsilon \in (0, 1]$ the cost of the tree is at least $\frac{\epsilon B}{2}$.
One can find a tree $T^*$ containing $r$ with the prize-to-cost ratio at least $\frac{\epsilon}{4}\gamma$ such that
$\frac{\epsilon}{2} B \le c(T^*) \le (1 + \epsilon) B$.

Proof. Consider T rooted at r. As an initial step, we repeatedly remove a subtree of T 
if (i) the (prize-to-cost) ratio of the remaining tree is at least 7; and (ii) the cost of the 
remaining tree is at least ^. We repeat this until no such subtree can be found. 

If the current cost of T is at most (1 + f)B we are done. Suppose it is not the case. 
A subtree T' is rich if c{T') > f-B and the ratio of T' and all its subtrees is at least 7. 
Indeed the existence of a rich subtree proves the lemma. 

Claim. Given a rich subtree T' , the desired tree T* can be found. 
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Proof. Find a rich subtree T" C T' such that the strict subtrees of T" are not rich, i.e., 
c{T") > ^B while the cost of strict subtrees of T" (if any exists) is less than ^B. Let 
C denote the total cost of the immediate subtrees of T". We distinguish between two 
cases. 

- If $C < \frac{\epsilon}{2}B$ then we can connect the root of $T''$ directly to $r$. The cost of the 
resulting tree is at most $C + B \le (1+\epsilon)B$. On the other hand, $T''$ is rich, thus 
the prize of $T''$ is at least $\gamma(\frac{\epsilon}{2}B)$. Therefore the resulting tree has the desired ratio 

$$\frac{\gamma\epsilon}{2(1+\epsilon)} \ge \frac{\gamma\epsilon}{4}.$$ 

- If $C \ge \frac{\epsilon}{2}B$, we can pick a subset of immediate subtrees of $T''$ such that their total 
cost is between $\frac{\epsilon}{2}B$ and $\epsilon B$. We connect these subtrees to the root by picking the 
path from the root of $T''$ to $r$. Using the same argument as above, one can show 
that the resulting tree has the desired properties. □ 

It only remains to consider the case that no rich subtree exists. Since $T$ is not rich, 
the ratio of at least one subtree is less than $\gamma$. Find a subtree $T'$ such that the ratio of 
$T'$ is less than $\gamma$ while the ratio of all of its strict subtrees (if any exists) is at least $\gamma$. 
Though the ratio of $T'$ is low, we have not removed it in the initial step. Thus the cost of 
$T \setminus T'$ is less than $\frac{\epsilon}{2}B$. However, $c(T) > (1+\epsilon)B$ and thus the total cost of immediate 
subtrees of $T'$ is at least $\frac{\epsilon}{2}B$. On the other hand the cost of an immediate subtree of $T'$ 
is less than $\frac{\epsilon}{2}B$, otherwise it would be a rich subtree. Therefore we can pick a subset of 
immediate subtrees of $T'$ such that their total cost is between $\frac{\epsilon}{2}B$ and $\epsilon B$. We connect 
these subtrees by connecting the root of $T'$ directly to $r$. The resulting tree has cost 
at most $(1+\epsilon)B$ and prize at least $\gamma(\frac{\epsilon}{2}B)$, which completes the proof. □ 
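
The subset-picking step used twice in the proof can be sketched as a greedy prefix scan. This is an illustrative sketch, not the authors' code: the name `pick_subtrees` and the list-of-costs input are assumptions, and `lo` plays the role of the lower threshold (half of epsilon times B in Lemma 3). It relies on the two facts the proof establishes: every immediate subtree costs less than `lo`, and together they cost at least `lo`.

```python
def pick_subtrees(costs, lo):
    """Pick a prefix of items whose total cost lies in [lo, 2*lo).

    Preconditions (as in the proof): every cost is < lo and the sum of
    all costs is >= lo. Returns (chosen indices, total cost).
    """
    if any(c >= lo for c in costs) or sum(costs) < lo:
        raise ValueError("need every cost < lo and total cost >= lo")
    chosen, total = [], 0
    for i, c in enumerate(costs):
        if total >= lo:
            break
        chosen.append(i)
        total += c
    # Before the last addition the total was < lo and the added cost is
    # < lo, so the final total is < 2*lo.
    return chosen, total
```

For example, with costs `[3, 4, 2, 5]` and `lo = 6`, the scan stops after the first two items with total cost 7.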

Moss and Rabani [13] give an $O(\log n)$-approximation algorithm for the Budgeted 
problem which may violate the budget by a factor of two. Using Lemma 3 one can 
trim such a solution to achieve a trade-off between the violation of the budget and the 
approximation factor. 

Theorem 3. For every $\epsilon \in (0, 1]$ one can find a subtree $T \subseteq G$ in polynomial time such 
that $c(T) \le (1+\epsilon)B$ and the total prize of $T$ is an $\Omega(\frac{\epsilon^2}{\log n})$ fraction of OPT. 

Proof. First we make the graph $B$-proper for the root $r$ by simply discarding the vertices 
which are farther than $B$ from $r$. Note that these vertices cannot be part of an optimal 
solution. By Theorem 8, we can find a tree $T'$ with $\pi(T') \ge \frac{OPT}{O(\log n)}$ and $c(T') \le 2B$. 
Suppose the cost of $T'$ is more than $(1+\epsilon)B$; otherwise we are done. 

Let $\gamma(T')$ denote the prize-to-cost ratio of $T'$. Observe that $\gamma(T') = \frac{\pi(T')}{c(T')} \ge \frac{OPT}{O(\log n) \cdot 2B}$. 
By Lemma 3 we can trim $T'$ to obtain a subtree $T$ such that 

- The prize-to-cost ratio of $T$ is $\gamma(T) \ge \frac{\epsilon}{4}\gamma(T') \ge \frac{\epsilon \, OPT}{O(\log n) \, B}$. 

- The cost of $T$ is sandwiched between $\frac{\epsilon}{2}B$ and $(1+\epsilon)B$. 

Therefore the cost of $T$ does not violate the budget by much and $\pi(T)$ is at least 
$$\pi(T) \ge \gamma(T)\left(\tfrac{\epsilon}{2}B\right) \ge \frac{\epsilon^2 \, OPT}{O(\log n)}. \qquad \Box$$ 
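
To make the dependence on the constants explicit, the chain of inequalities can be written out as follows. Here $\alpha$ is a symbol introduced only for this illustration; it stands for the $O(\log n)$ approximation factor of Theorem 8, so that $\pi(T') \ge OPT/\alpha$ and $c(T') \le 2B$.

```latex
\begin{align*}
\gamma(T') &= \frac{\pi(T')}{c(T')} \ge \frac{OPT/\alpha}{2B} = \frac{OPT}{2\alpha B}, \\
\gamma(T)  &\ge \frac{\epsilon}{4}\,\gamma(T') \ge \frac{\epsilon\, OPT}{8\alpha B}, \\
\pi(T)     &= \gamma(T)\, c(T) \ge \frac{\epsilon\, OPT}{8\alpha B}\cdot\frac{\epsilon B}{2}
            = \frac{\epsilon^{2}\, OPT}{16\,\alpha}.
\end{align*}
```

Under this bookkeeping the loss relative to Theorem 8 is a factor of $16/\epsilon^2$, which matches the $\Omega(\epsilon^2/\log n)$ fraction claimed in the theorem.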

3.2 The Unrooted Budgeted Problem 

We prove a stronger variant of Lemma 3 for the unrooted version. We show that if 
no single vertex is too expensive, one does not need to violate the budget at all. The 
analysis is similar to that of Lemma 3. For the sake of completeness, we have presented 
the proof in Appendix D. 

Lemma 4. Let $T$ be a tree with prize-to-cost ratio $\gamma$. Suppose $c(T) \ge \frac{B}{4}$ and the 
cost of every vertex of the tree is at most $\frac{B}{2}$ for a real number $B$. One can find a subtree 
$T^* \subseteq T$ with prize-to-cost ratio at least $\frac{\gamma}{4}$ such that $\frac{B}{4} \le c(T^*) \le B$. 

One may use arguments similar to that of Theorem 3 to derive an O(logn)- 
approximation algorithm from Lemma 4 when the cost of a vertex is not too big. On 
the other hand if the cost of a vertex is more than half the budget, we can guess that 
vertex and try to solve the problem with the remaining budget. However, one obstacle 
is that this process may need to be repeated, i.e., the cost of another vertex may be more 
than half the remaining budget. Thus we may need to continue guessing many vertices 
in which case connecting them in an optimal manner would not be an easy task. The 
following theorem shows that guessing a single vertex indeed suffices if one is willing to 
lose an extra factor of two in the approximation guarantee. 

Theorem 4. The unrooted budgeted problem admits an $O(\log n)$-approximation algorithm 
which does not violate the budget constraint. 

Proof. We define two classes of subtrees: the flat trees and the saddled trees. A tree is 
flat if the cost of every vertex of the tree is at most $\frac{B}{2}$. For a tree $T$, let $x$ be the vertex 
of $T$ with the largest cost. The tree $T$ is saddled if $c(x) > \frac{B}{2}$ and the cost of every other 
vertex of the tree is at most $\frac{B - c(x)}{2}$. Let $T_f^*$ denote the optimal flat tree, i.e., a flat tree 
with the maximum prize among all the flat trees with total cost at most $B$. Similarly, 
let $T_s^*$ denote the optimal saddled tree. 

The proof has two parts. First we show that the prize of the better solution 
between $T_f^*$ and $T_s^*$ is within a constant factor of OPT. Next, we show that by restricting the 
optimum to either of the two classes, an $O(\log n)$-approximate solution can be found 
in polynomial time. Together these give the desired approximation algorithm. 

Claim. Either $\pi(T_f^*) \ge \frac{OPT}{2}$ or $\pi(T_s^*) \ge \frac{OPT}{2}$. 

Proof. Let $T^*$ denote the optimal tree. If $T^*$ consists of only one vertex then clearly it 
is either a flat tree or a saddled tree and we are done. Now assume that $T^*$ is neither 
flat nor saddled and it has at least two vertices. Let $x$ and $y$ denote the vertices with the 
maximum cost and the second maximum cost in $T^*$, respectively. Since $T^*$ is not flat 
we have $c(x) > \frac{B}{2}$ and $c(y) < \frac{B}{2}$. It is not saddled either, thus $c(y) > \frac{B - c(x)}{2}$. Observe 
that the cost of any other vertex of $T^*$ is at most $\frac{B - c(x)}{2}$. Consider the path between $y$ 
and $x$ in $T^*$. Let $e$ denote the edge of the path which is adjacent to $y$. Removing $e$ from 
$T^*$ results in two subtrees $T_y$ and $T_x$ containing $y$ and $x$, respectively. The cost of 
every vertex in $T_y$ is at most $c(y) < \frac{B}{2}$, thus $T_y$ is flat. On the other hand the cost of every 
vertex in $T_x$ except $x$ is at most $\frac{B - c(x)}{2}$, thus $T_x$ is saddled. This completes the proof 
since one of the subtrees has at least half the optimal prize $\pi(T^*)$. □ 
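
The two classes can be checked mechanically from the vertex costs alone. The following is a minimal sketch under assumed names (`classify`, a dict from vertex to cost); it uses the definitions of flat and saddled given in the proof of Theorem 4 and compares doubled costs to avoid fractions.

```python
def classify(costs, B):
    """Return 'flat', 'saddled' or 'neither' for a tree with these vertex costs.

    costs: dict mapping each vertex of the tree to its cost (hypothetical
    representation). Only the costs matter for the classification.
    """
    # Flat: every vertex costs at most B/2.
    if all(2 * c <= B for c in costs.values()):
        return "flat"
    # Not flat, so the most expensive vertex x costs more than B/2.
    x = max(costs, key=costs.get)
    rest = B - costs[x]
    # Saddled: every other vertex costs at most (B - c(x)) / 2.
    if all(2 * c <= rest for v, c in costs.items() if v != x):
        return "saddled"
    return "neither"
```

For instance, with B = 10 and costs 6, 3 and 1 the tree is neither flat nor saddled; splitting off the second most expensive vertex, as in the claim, leaves a flat part (cost 3) and a saddled part (costs 6 and 1).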

Now we only need to restrict the algorithm to flat trees and saddled trees. Indeed 
we can reduce the case of saddled trees to flat trees. We simply guess the maximum-cost 
vertex $x$ (by iterating over all vertices). We form a new instance of the problem by 
reducing the budget to $B - c(x)$ and the cost of $x$ to zero. The cost of every other vertex 
in $T_s^*$ is at most half the remaining budget, thus we need to look for the best flat tree 
in the new instance. Therefore it only remains to find an approximate solution when 
restricted to flat trees. 
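
The reduction from saddled to flat instances only rewrites the budget and one vertex cost. A small sketch, with hypothetical helper names `reduce_saddled_to_flat` and `is_flat` and a dict of vertex costs as the assumed input:

```python
def is_flat(costs, B):
    """A tree is flat if every vertex costs at most half the budget."""
    return all(2 * c <= B for c in costs.values())


def reduce_saddled_to_flat(costs, B, x):
    """Guess x as the maximum-cost vertex: pay for it up front.

    Returns the new cost function (x now costs zero) and the reduced
    budget B - c(x). The input dict is left unmodified.
    """
    new_costs = dict(costs)
    new_costs[x] = 0
    return new_costs, B - costs[x]
```

An algorithm following the proof would call this once for every candidate vertex x and keep the best flat solution found over all resulting instances.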

We use Lemma 4 to find the desired solution for flat trees. A vertex with cost more 
than half the budget cannot be in a flat tree, thus we remove all such vertices. We 
may guess a vertex of the best solution and by using the algorithm of Moss and Rabani 
[13] (see Theorem 8 in the Appendix) we can find an $O(\log n)$-approximate 
solution which may use twice the budget. Let $T$ be the resulting tree with total prize 
$P$. If $c(T) \le B$ we are done. Otherwise by Lemma 4 we can trim $T$ to obtain a tree 
with cost at most $B$ and prize at least $\frac{P}{32}$, which completes the proof. □ 

A Hardness of Net Worth 

Here we present the hardness result for the rooted NW and directed NW given in 
Feigenbaum et al. [6] with slight modifications. We show that NW is NP-hard to 
approximate within any finite factor when restricted to the case of bounded degree graphs. 

Theorem 5. For any $\epsilon$, $0 < \epsilon < 1$, it is NP-hard to approximate (see the footnote) the rooted, whether 
directed or undirected, net worth problem within a ratio $\epsilon$. 

Proof. Given an instance $I$ of 3-SAT, we make an instance $J$ of the NW problem such that: 
(i) if $I$ is a yes-instance (i.e., it is satisfiable), then an $\epsilon$-approximate answer to $J$ is 
strictly greater than $\epsilon$; and (ii) if $I$ is a no-instance then the optimal answer to $J$ is at 
most $\epsilon$. Let $n$ and $m$ be the number of variables and clauses in $I$, respectively. Without 
loss of generality we assume that for every variable $x$ there is a clause $x \vee \bar{x}$ in $I$, thus 
$m \ge n + 1$. We make the instance $J$ with four layers of vertices as follows: 

- In the top layer, we put the root $r$ with prize $\pi(r) = \epsilon$. 

- In the second layer, we put a vertex $r'$ with prize zero, connected to $r$ via an edge 
of cost $mK - (n + 1) - m$ for a fixed $K > n + 1$. 

- The third layer contains $2n$ vertices, one for every literal in $I$, all with prize zero 
and connected to $r'$ via edges of unit cost. 

- The last layer contains $m$ vertices, one for every clause, all with prize $K$ and each connected 
to the vertices corresponding to the literals it contains via edges of unit cost. 

In the case of directed NW, we direct all the edges from top to bottom. We claim that 
if $I$ is satisfiable then $NW(J) \ge 1 + \epsilon$, otherwise $NW(J) \le \epsilon$. Note that in the 
former case an $\epsilon$-approximation algorithm would give us a solution with net worth at least 
$\epsilon(1 + \epsilon) > \epsilon$ and in the latter it would give us a solution with net worth at most $\epsilon$, thus 
it can distinguish the satisfiability of $I$. 
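
The four-layer construction can be sketched in code as follows. This is an illustration under several assumptions that are not in the paper: clauses are given as tuples of nonzero integers (a negative integer is a negated variable), the tautological clauses for each variable are appended by the function itself, K defaults to n + 2 as one admissible choice of K > n + 1, and the vertex labels and function names are hypothetical.

```python
from fractions import Fraction


def build_nw_instance(n, clauses, eps, K=None):
    """Build vertex prizes and edge costs of the NW instance J."""
    # Ensure every variable x has a clause (x or not x), as assumed w.l.o.g.
    clauses = [tuple(c) for c in clauses] + [(x, -x) for x in range(1, n + 1)]
    m = len(clauses)
    K = n + 2 if K is None else K
    prize = {"r": eps, "r'": 0}
    edges = {("r", "r'"): m * K - (n + 1) - m}   # the expensive top edge
    for x in range(1, n + 1):
        for lit in (x, -x):
            prize[("lit", lit)] = 0
            edges[("r'", ("lit", lit))] = 1
    for j, clause in enumerate(clauses):
        prize[("cl", j)] = K
        for lit in set(clause):
            edges[(("lit", lit), ("cl", j))] = 1
    return prize, edges, clauses


def net_worth_of_assignment(n, clauses, eps, assignment):
    """Net worth of the tree that buys the true literals and every clause they satisfy."""
    prize, edges, all_clauses = build_nw_instance(n, clauses, eps)
    true_lits = {x if assignment[x] else -x for x in range(1, n + 1)}
    worth = prize["r"] - edges[("r", "r'")] - len(true_lits)
    for j, clause in enumerate(all_clauses):
        if true_lits & set(clause):
            worth += prize[("cl", j)] - 1        # clause prize minus one unit edge
    return worth
```

On a tiny instance with two variables and the single clause (x1 or not x2), a satisfying assignment yields net worth exactly 1 + eps, in line with the claim above.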

To get a solution with net worth more than $\epsilon$, we have to buy the edge $(r, r')$, 
thus incurring a big cost. In order to include a subset $S$ of vertices of the fourth layer, 
we need to buy at least one edge between layer two and layer three, and $|S|$ edges 
connecting layer three to $S$. Thus the maximum net worth we could get is 

$$\begin{aligned}
\epsilon + |S|K - 1 - |S| - (mK - (n + 1) - m) &= \epsilon + |S|(K - 1) - m(K - 1) + n \\
&= \epsilon + (|S| - m)(K - 1) + n \\
&\le \epsilon + n(|S| - m + 1),
\end{aligned}$$

where the last inequality holds since $|S| - m \le 0$ and $K > n + 1$. Therefore to have 
a net worth strictly more than $\epsilon$, we need to include all the vertices in the fourth layer. 
Observe that the maximum possible net worth is $\epsilon + n$. Recall that for every variable 
$x$ there is a clause $x \vee \bar{x}$. To include the vertex corresponding to this clause, we need to 
include at least one vertex corresponding to a literal of $x$. On the other hand, we cannot 
include more than $n$ vertices of the third layer, or otherwise we could not achieve a 
net worth more than $\epsilon$. This shows that for every variable $x$, the vertex $r'$ would be 
connected to exactly one of the vertices corresponding to $x$ and $\bar{x}$. Therefore a solution 
of net worth more than $\epsilon$ corresponds to a satisfying assignment; in fact, the net worth 
of such a solution would be exactly $1 + \epsilon$. This completes the proof. □ 

Footnote: Algorithm $A$ approximates a function $f$ within ratio $\epsilon$ iff for every input instance $x$, 
$\epsilon f(x) \le A(x) \le f(x)/\epsilon$. Since NW is a maximization problem, we can assume that $A(x) \le f(x)$. 

B Reductions for Quota Problems 

Here we present two important reductions that were missing in the literature. More 
specifically, we show that rooted and unrooted $k$-MST and their $k$-Steiner Tree versions 
are all equivalent and indeed equivalent to the Quota problem. (That these are simpler 
than the latter is easy.) These results improve the approximation ratio of $k$-Steiner Tree 
from 4 to 2. Ravi et al. [14] had provided a reduction from $k$-Steiner Tree to $k$-MST 
losing a factor 2, whereas Chudak et al. [4] had conjectured the presence of a $(2 + \epsilon)$-approximation 
algorithm while presenting one with approximation ratio of 5. 

This appendix deals with four special cases of the quota node-weighted Steiner tree 
problem. We first claim that the rooted $k$-Steiner tree problem is equivalent to the quota 
problem, with a factor of $1 + \epsilon$ for a polynomially small $\epsilon$. That the former is a special 
case of the latter can be observed easily by setting vertex prizes to 0 and 1 for Steiner 
and terminal nodes, respectively, and looking for a prize of at least $k$. To establish the 
other direction of the reduction, given a graph $G$ with prize $\pi$ and cost $c$ on its vertices, 
as well as a target prize value $P$, we produce an instance of $k$-Steiner tree as follows. We 
assume all vertices of $G$ are Steiner vertices and connect each vertex $u$ to $q(u) = \lceil \frac{n \pi(u)}{\epsilon P} \rceil$ 
new terminal vertices of cost zero. In this instance we let $k = \lfloor n/\epsilon \rfloor$. Clearly any 
solution to the quota instance turns into a solution of $k$-Steiner tree if one collects the 
terminals immediately connected to the solution vertices. Next consider a solution to 
the $k$-Steiner tree instance. We can assume without loss of generality that either none 
or all the terminals connected to one node are in the solution. The solution to the quota 
instance simply includes all Steiner nodes all of whose adjacent terminals are picked in 
the $k$-Steiner tree instance. For such nodes we have $\sum_u q(u) \ge k$. Note that there are at 
most $n$ such Steiner nodes, and for each of them, say $u$, we have $q(u) \cdot \frac{\epsilon P}{n} \le \frac{\epsilon P}{n} + \pi(u)$. 
Therefore, we get $(\frac{\epsilon P}{n}) \sum_u q(u) \le \epsilon P + \sum_u \pi(u)$. However, the solution guarantee (in 
the $k$-Steiner tree instance) is that the left-hand side is at least $(\frac{\epsilon P}{n}) k \ge (\frac{\epsilon P}{n})(\frac{n}{\epsilon} - 1) = P - \frac{\epsilon P}{n}$. 
Putting these two together and noting that $n \ge 1$, we obtain $\sum_u \pi(u) \ge P(1 - 2\epsilon)$. 
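
The numbers in this reduction can be checked on a small example. The sketch below assumes the reconstruction $q(u) = \lceil n\pi(u)/(\epsilon P) \rceil$ and $k = \lfloor n/\epsilon \rfloor$ stated above; the function name `quota_to_k_steiner` and the dict-of-prizes input are hypothetical, and exact rational arithmetic is used so that the rounding is exact.

```python
from fractions import Fraction
from math import ceil, floor


def quota_to_k_steiner(prize, P, eps):
    """Return (q, k): terminals attached to each vertex and the target k.

    prize: dict vertex -> prize, P: target prize, eps: a Fraction in (0, 1].
    """
    n = len(prize)
    q = {u: ceil(Fraction(n) * p / (eps * P)) for u, p in prize.items()}
    k = floor(Fraction(n) / eps)
    return q, k
```

With prizes 3, 5 and 2, target P = 8 and eps = 1/4, this gives q = 5, 8 and 3 with k = 12; every vertex set whose attached terminals number at least k collects prize at least P(1 - 2 eps) = 4.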

The following two theorems show the other three problems are equivalent to $k$-Steiner 
tree (and hence to the quota problem). 

Theorem 6. Let $\alpha$, $0 < \alpha < 1$, be a constant. The following two statements are equivalent: 

(i) There is an $\alpha$-approximation algorithm for the rooted $k$-MST problem. 
(ii) There is an $\alpha$-approximation algorithm for the unrooted $k$-MST problem. 

Proof. We note that by running the rooted $k$-MST algorithm for every vertex, (i) immediately 
implies (ii). To prove that (ii) implies (i), we give a cost-preserving reduction from the 
rooted variant to the unrooted variant. Let $\langle G = (V, E), r, k \rangle$ be an instance of 
rooted $k$-MST and let $n = |V|$. We add $n$ vertices to $G$, all connected by edges of cost 
zero to $r$. Let $k' = k + n$ and consider the solution to (unrooted) $k'$-MST on the new 
graph. Since $k' > n - 1$, a subtree of size $k'$ has to include $r$. Thus we can assume that 
there exists an optimal solution which includes all the $n$ extra vertices plus a minimum-cost 
subtree of size $k$ rooted at $r$. Hence the reduction preserves the cost of the optimal 
solution. □ 
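
The instance transformation in this proof is simple enough to state as code. A minimal sketch, assuming an edge-cost dictionary keyed by vertex pairs; the function name and the `("dummy", i)` labels are hypothetical:

```python
def rooted_to_unrooted_kmst(vertices, edges, r, k):
    """Attach n zero-cost pendant vertices to the root r and raise k by n.

    vertices: list of vertex labels; edges: dict (u, v) -> cost.
    Returns the new vertex list, new edge dict and the new target k'.
    """
    n = len(vertices)
    new_vertices = list(vertices) + [("dummy", i) for i in range(n)]
    new_edges = dict(edges)
    for i in range(n):
        new_edges[(r, ("dummy", i))] = 0
    return new_vertices, new_edges, k + n
```

Because the pendant vertices are adjacent only to r, any subtree avoiding r has at most n - 1 vertices, which is fewer than the new target.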

Theorem 7. Let $\alpha$, $0 < \alpha < 1$, be a constant. The following two statements are equivalent: 

(i) There is an $\alpha$-approximation algorithm for the $k$-Steiner Tree problem. 
(ii) There is an $\alpha$-approximation algorithm for the $k$-MST problem. 

Proof. We note that one direction is clear by definition. To prove that (ii) implies (i), similar 
to Theorem 6, we give a cost-preserving reduction from $k$-Steiner Tree to $k$-MST. 
Let $\langle G = (V, E), T, k \rangle$ be an instance of $k$-Steiner Tree with the set of terminals 
$T \subseteq V$. Let $n = |V|$. For every terminal $v_t \in T$, add $n$ vertices at distance zero from $v_t$. 
Let $k' = kn + k$ and consider the solution to $k'$-MST on the new graph. Any subtree 
with at most $k - 1$ terminals has at most $(k - 1)n + n - 1 = kn - 1$ vertices. Therefore 
an optimal solution covers at least $k$ terminals. Hence the reduction preserves the cost 
of the optimal solution. □ 
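
As with Theorem 6, the construction can be sketched directly. The representation (edge-cost dictionary, `("copy", t, i)` labels) and the function name are assumptions made for illustration:

```python
def k_steiner_to_kmst(vertices, edges, terminals, k):
    """Attach n zero-cost copies to every terminal and set k' = k*n + k.

    vertices: list of vertex labels; edges: dict (u, v) -> cost;
    terminals: iterable of terminal vertices.
    """
    n = len(vertices)
    new_vertices = list(vertices)
    new_edges = dict(edges)
    for t in terminals:
        for i in range(n):
            copy = ("copy", t, i)
            new_vertices.append(copy)
            new_edges[(t, copy)] = 0
    return new_vertices, new_edges, k * n + k
```

The counting argument of the proof then says that a subtree touching at most k - 1 terminals has at most kn - 1 vertices, which is below the new target k'.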

All the above proofs work, mutatis mutandis, for the edge-weighted case, too. 



C Integrality Gap for Budgeted Steiner Tree 

In this section we discuss the linear programming approach to the Budgeted problem. 
Let $P_v$ denote the set of all paths from the root to vertex $v$. We may also assume that all the 
edges have unit length. Consider the flow-based linear program below. 

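$$
\begin{array}{llll}
\text{maximize} & \sum_{v \in V} \pi_v \sum_{p \in P_v} f_p & & \\
\text{subject to} & \sum_{p \in P_v :\, e \in p} f_p \le x_e & \forall e \in E,\ \forall v \in V & (X) \\
& \sum_{p \in P_v} f_p \le 1 & \forall v \in V & (F) \\
& \sum_{e \in E} x_e \le B & & (B) \\
& f_p,\ x_e \ge 0 & &
\end{array}
$$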

Intuitively, for a path $p$ ending at $v$, $f_p$ denotes the total flow reaching $v$ through $p$ and 
$x_e$ denotes the maximum flow passing through the edge $e$. Constraint (B) keeps the cost 
of edges within budget and constraint (F) restricts the total flow reaching a vertex. One can 
also write a similar cut-based linear program. However, we can show that even if 
$G$ is a tree, the gap between the fractional and integral solutions is unbounded. Let $G$ 
be a tree obtained by putting a star at the end of a long path of length $B - 1$ (see Fig. 2). 
Let $u_1, \ldots, u_k$ denote the leaves other than the root, each of which has 1 unit of profit. Other 
vertices have zero profit. Clearly the optimal integral solution gains one unit of profit. 
Let $p_i$ denote the path from $r$ to $u_i$. Consider a feasible fractional solution where for 
every $i$, $f_{p_i} = \frac{B}{B+k-1}$ and therefore for every edge $e$, $x_e = \frac{B}{B+k-1}$. Since 
there are $B + k - 1$ edges, we are not exceeding the budget. This shows that the optimal 
fractional solution is at least $\frac{kB}{B+k-1}$ and hence in the case $B \ge k$, the gap between the 
fractional and the integral solution is $\Omega(k)$. 
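
The fractional solution described above can be verified numerically. The following sketch assumes the values $f_{p_i} = x_e = B/(B+k-1)$ as reconstructed in the text; the function name `lp_gap_example` is hypothetical, and the integral optimum of 1 is taken from the text rather than computed.

```python
from fractions import Fraction


def lp_gap_example(B, k):
    """Check feasibility and value of the fractional solution on the path-plus-star tree."""
    num_edges = (B - 1) + k                 # path of length B-1 plus k star edges
    flow = Fraction(B, B + k - 1)           # common value of every f_{p_i} and x_e
    feasible = flow <= 1 and flow * num_edges <= B   # constraints (F) and (B)
    fractional = k * flow                   # each of the k leaves has unit profit
    integral = 1                            # as argued in the text
    return feasible, fractional, integral
```

For B = 10 and k = 5 the fractional value is 25/7 against an integral optimum of 1; whenever B is at least k the fractional value is at least k/2.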



Fig. 2. An example showing the unbounded gap of the LP for the budget problem. 



D Omitted Proofs 

Proof of Lemma 1. Let $y_1, \ldots, y_{|T'|}$ denote the disks of radius $R$ centered at the terminals 
in $T'$. Increasing the radius of all disks by any $\epsilon > 0$ creates an infeasibility in 
their union $y$. Thus at least one of the following holds for $y$: 

- The constraint C2 is tight for a set $S$ containing a terminal, i.e., $\sum_{S' \subseteq S} y(S') = \pi(S) > 0$. 
Let $S$ be such a set with the smallest cardinality. Recall that by Fact 3, 
$y(S')$ is positive for a set $S'$ only if $S'$ contains the center of a disk and is a subset of 
the continent of that disk. We remove the zero terms from both sides of the equality. 
The right-hand side would be the penalty of a subset of the terminals and the left-hand 
side would be the sum over dual variables $y(S')$ such that $S'$ is a subset of a 
continent. If the inequality is tight it has to be tight induced to any disk, otherwise 
$y$ is not feasible. Thus the smallest set $S$ is indeed a subset of the continent of a 
disk. Now let $S^*$ be the continent of that disk. The sets $S$ and $S^*$ share the same 
core, thus the right-hand sides of the constraint C2 for both are the same. However 
the left-hand side of the constraint for $S^*$ can only be larger, which leads to (i). 

- For a vertex $v$ the constraint C1 becomes infeasible if we grow every disk by any 
$\epsilon > 0$. The constraint for $v$ is tight w.r.t. $y$. If the constraint for $v$ is not tight in any 
of the $y_i$'s independently, then $v$ is on the boundary of more than one disk, which leads 
to (ii). Otherwise assume the constraint for $v$ is tight in $y_i$ for an $i \in [|T'|]$. If 
we extend the radius by $\epsilon$, $v$ will be inside the $i$-th disk, thus $\sum_{S \mid v \in \delta(S)} y_i(S)$ will 
not change. However by the assumption about $v$, the same summation for $y$, i.e., 
$\sum_{S \mid v \in \delta(S)} y(S)$, will increase. Therefore a neighbor of $v$ is at most $R$ far from the 
center of another disk, say that of the $j$-th disk. By definition, $v$ is on the boundary 
of the $j$-th disk. Further, by Proposition 1, $v$ cannot be inside the $i$-th disk and so is 
on its boundary, which leads to (ii). □ 

Proof of Lemma 2. Consider the process of growing the disk during which we start 
with a set $S$ (initialized to the core containing $t$) and add the vertices for which the 
constraint C1 becomes tight. Using induction it is easy to show that a vertex $u$ in the 
continent is added to $S$ when the total growth passes $d^c(t, u)$. Let $u$ be the closest 
neighbor of $v$ to the center, thus $R - (d^c(t, v) - c(v)) = R - d^c(t, u)$. If $u$ is not 
inside the disk, then $d^c(t, u) = R$ and we are done. Otherwise, as soon as the radius of 
the disk passes $d^c(t, u)$, $S$ touches the vertex $v$ and any further growth contributes to 
$\sum_{S \mid v \in \delta(S)} y(S)$. Since at the end $v$ is not inside the disk, this contribution continues 
until the total growth reaches $R$. Therefore 

$$\sum_{S \mid v \in \delta(S)} y(S) = R - d^c(t, u) = R - (d^c(t, v) - c(v)). \qquad \Box$$ 

Theorem 8 (Theorem 12 of [13]). For an instance of the rooted budgeted problem, an 
$O(\log n)$-approximate solution can be found in polynomial time which uses at most 
twice the budget. 

Proof of Lemma 4. We make $T$ rooted at an arbitrary vertex $r$. As the first pruning step, 
we repeatedly discard a subtree if the ratio and the cost of the remaining tree do not 
go below $\gamma$ and $\frac{B}{4}$, respectively. We stop when no such subtree can be found. Suppose 
the current cost of $T$ is more than $B$; otherwise we are done. As in Lemma 3, a subtree 
$T'$ is rich if $c(T') \ge \frac{B}{4}$ and the ratio of $T'$ and all subtrees of $T'$ is at least $\gamma$. Note that one can easily 
check whether a subtree is rich. 

First we show that given a rich subtree we can easily find the solution. Observe that all 
the subtrees of a rich subtree are also rich unless their cost is less than $\frac{B}{4}$. Given a rich 
subtree, let $T'$ be its lowest rich subtree, i.e., the cost of any immediate subtree of $T'$ (if 
any exists) is less than $\frac{B}{4}$. Now let $C$ denote the total cost of all immediate subtrees of $T'$. 

- If $C < \frac{B}{2}$ (or no child exists), then $c(T') \le B$ since the cost of the root of $T'$ does 
not exceed $\frac{B}{2}$. Thus $T'$ satisfies the properties desired in the lemma. Recall that $T'$ 
is rich and thus its ratio is at least $\gamma$. 

- If $C \ge \frac{B}{2}$, we can pick a subset of immediate subtrees of $T'$ such that their total 
cost is between $\frac{B}{4}$ and $\frac{B}{2}$. This can be done since the cost of an immediate subtree 
is at most $\frac{B}{4}$. Let $T^*$ be the tree formed by connecting these subtrees to the root of 
$T'$. Observe that $c(T^*) \le B$ and the total prize is at least $\pi(T^*) \ge \frac{\gamma B}{4}$. Therefore 
the ratio of $T^*$ is at least $\frac{\gamma}{4}$. 



Footnote: For an integer x, let [x] denote the set {1, 2, . . . 



It only remains to consider the case that $T$ does not have a rich subtree. Since $T$ is not 
rich, a subtree of $T$ has ratio less than $\gamma$. Let $T'$ be a subtree with ratio less than $\gamma$ such 
that all strict subtrees of $T'$ (if any exists) have ratio at least $\gamma$. Observe that the cost 
of an immediate subtree of $T'$ is less than $\frac{B}{4}$, otherwise it would be a rich subtree. On 
the other hand, we have not discarded $T'$ in the first pruning step, hence $c(T \setminus T') < \frac{B}{4}$. 
Furthermore $c(T) > B$, thus the total cost of immediate subtrees of $T'$ is at least $\frac{B}{4}$. 
Now similar to the previous argument, we can pick a subset of immediate subtrees of 
$T'$ such that their total cost is between $\frac{B}{4}$ and $\frac{B}{2}$. The tree formed by connecting these 
subtrees to the root of $T'$ has the desired properties. □ 